Genome nucleotide composition shapes variation in simple sequence repeats.

نویسندگان

  • Xiangjun Tian
  • Joan E Strassmann
  • David C Queller
چکیده

Simple sequence repeats (SSRs) or microsatellites are a common component of genomes but vary greatly across species in their abundance. We tested the hypothesis that this variation is due in part to AT/GC content of genomes, with genomes biased toward either high AT or high CG generating more short random repeats that are long enough to enhance expansion through slippage during replication. To test this hypothesis, we identified repeats with perfect tandem iterations of 1-6 bp from 25 protists with complete or near-complete genome sequences. As expected, the density and the frequency are highly related to genome AT content, with excellent fits to quadratic regressions with minima near a 50% AT content and rising toward both extremes. Within species, the same trends hold, except the limited variation in AT content within each species places each mainly on the descending (GC rich), middle, or ascending (AT rich) part of the curve. The base usages of repeat motifs are also significantly correlated with genome nucleotide compositions: Percentages of AT-rich motifs rise with the increase of genome AT content but vice versa for GC-rich subgroups. Amino acid homopolymer repeats also show the expected quadratic relationship, with higher abundance in species with AT content biased in either direction. Our results show that genome nucleotide composition explains up to half of the variance in the abundance and motif constitution of SSRs.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Independence of color intensity variation in red flesh apples from the number of repeat units in promoter region of the MdMYB10 gene as an allele to MdMYB1 and MdMYBA

MdMYB10 gene expression results in accumulation of anthocyanin in many tissues including flesh of applefruit. The MdMYB1 and MdMYBA genes are close homologues to MdMYB10 gene and both are responsiblefor red color phenotype in apple fruit skin. In the current study, an apple genome sequence draft analysisindicated that these three genes are located in a unique contig. Further a...

متن کامل

Genome Variability and Gene Content in Chordopoxviruses: Dependence on Microsatellites

To investigate gene loss in poxviruses belonging to the Chordopoxvirinae subfamily, we assessed the gene content of representative members of the subfamily, and determined whether individual genes present in each genome were intact, truncated, or fragmented. When nonintact genes were identified, the early stop mutations (ESMs) leading to gene truncation or fragmentation were analyzed. Of all th...

متن کامل

Mapping and analysis of Simple Sequence Repeats in the Arabidopsis thaliana Genome

Simple sequence repeats (SSRs) are becoming standard DNA markers for plant genome analysis and are being used as markers in marker assisted breeding. And hence because of its great significance we have initiated this study to analyze complete genome of Arabidopsis thaliana for the prevalence of mono-, di-, tri-, tetra-, penta- and hexa- mer repeats in the coding and non-coding regions of the ch...

متن کامل

PSSRdb: a relational database of polymorphic simple sequence repeats extracted from prokaryotic genomes

PSSRdb (Polymorphic Simple Sequence Repeats database) (http://www.cdfd.org.in/PSSRdb/) is a relational database of polymorphic simple sequence repeats (PSSRs) extracted from 85 different species of prokaryotes. Simple sequence repeats (SSRs) are the tandem repeats of nucleotide motifs of the sizes 1-6 bp and are highly polymorphic. SSR mutations in and around coding regions affect transcription...

متن کامل

Differential distribution of simple sequence repeats in eukaryotic genome sequences.

Complete chromosome/genome sequences available from humans, Drosophila melanogaster, Caenorhabditis elegans, Arabidopsis thaliana, and Saccharomyces cerevisiae were analyzed for the occurrence of mono-, di-, tri-, and tetranucleotide repeats. In all of the genomes studied, dinucleotide repeat stretches tended to be longer than other repeats. Additionally, tetranucleotide repeats in humans and t...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Molecular biology and evolution

دوره 28 2  شماره 

صفحات  -

تاریخ انتشار 2011